perm filename A77.TEX[106,PHY] blob
sn#807779 filedate 1985-11-20 generic text, type C, neo UTF8
COMMENT ⊗ VALID 00002 PAGES
C REC PAGE DESCRIPTION
C00001 00001
C00002 00002 \magnification\magstephalf
C00012 ENDMK
C⊗;
\magnification\magstephalf
\input macro.tex
\def\today{\ifcase\month\or
January\or February\or March\or April\or May\or June\or
July\or August\or September\or October\or November\or December\fi
\space\number\day, \number\year}
\baselineskip 14pt
\rm
\line{\sevenrm a77.tex[106,phy] \today\hfill}
\bigskip
\noindent {\bf Systematic Debugging: Case Study.}
\bigskip
As I write this, I have not yet tested the Gaussian elimination program
in the previous section. I~am now going to plan the testing, carry out
the test, and report on my success in locating and correcting any
errors.
Because there are no conditional commands in the algorithm, choosing
a single test case, with {\tt M}~large enough, will test every command.
By choosing a problem with a known solution, it is easy to check the
final results. To keep the arithmetic simple, I~will let all the solutions
($x↓1$ to~$x↓n$) and their coefficients be small integers; since
$x↓0=-1$, I~can then choose
$a↓{10},a↓{20},\ldots,a↓{n0}$ as small
integers that make all the equations balance. I~will try ${\tt M}=3$ and
(using the random number generator on my pocket calculator):
$$\eqalignno{(x↓0&=-1)\cr
x↓1&=3\cr
x↓2&=6\cr
x↓3&=2\cr
\noalign{\smallskip}
a↓{11}&=3\cr
a↓{12}&=6\cr
a↓{13}&=6\cr
\noalign{\smallskip}
a↓{21}&=0\cr
a↓{22}&=9\cr
a↓{23}&=7\cr
\noalign{\smallskip}
a↓{31}&=7\cr
a↓{32}&=1\cr
a↓{33}&=4\cr
\noalign{\hbox{To make the equations balance, I must have}}
a↓{10}&=a↓{11}x↓1+a↓{12}x↓2+a↓{13}x↓3=57\cr
a↓{20}&=68\cr
a↓{30}&=35\cr}$$
Because the program is constructed using only types of commands which
always terminate, I~can expect it to halt even if wrong. Unplanned
termination may occur if a subscript is out of range or if there is
division by zero; my computer system will notify me if these occur, although
I~may then need to work to locate the cause. The program is short enough
(for me, under about fifty lines) that there is a reasonable probability
it is already correct. Therefore I~plan first to execute the program as
it stands on the given data, and see if the results agree with the known
solutions. If so, there is a good chance the program is entirely correct.
If the program gives incorrect results, I~plan to use the fact that it runs
in three stages: an input stage, an elimination stage, and a back-substitution
stage. By checking whether the known solutions satisfy the equations
after the first and second stage, I~can locate the first stage that fails,
and then concentrate attention on it.
Therefore I will add an array {\tt Y[0..?]} to the program to hold the
correct solution, initially read in the values $y↓0=1$, $y↓1=3$,
$y↓2=6$, $y↓3=2$, and check that for each~$i$,
$\sum↓{0≤j≤n}a↓{ij}y↓j\approx 0$ after each stage. The procedure
{\tt CHECK} performs this test, and reports its findings.
\smallskip
{\obeylines\obeyspaces\let =\ \tt
PROCEDURE CHECK (STAGE: INTEGER);
VAR BAD: BOOLEAN; I,J: INTEGER; SUM: REAL;
BEGIN
BAD:=FALSE;
FOR I:=1 TO M DO
BEGIN
SUM:=0;
FOR J:=1 TO M DO
SUM:=SUM+A[I,J]*Y[J];
IF ABS(SUM)>0.001 THEN
BAD:=TRUE
END;
IF BAD THEN
WRITELN('BAD CHECK AFTER STAGE ',STAGE:1);
END
}
\smallskip\noindent
By inserting {\tt CHECK(1)} and {\tt CHECK(2)} after stages~1 and~2 of
the program, I~will find the first failing stage. If an early stage fails,
I~plan to correct its errors before trying to diagnose any later stages.
Commonly, the results of errors in one stage of a program render
incomprehensible even the correct behavior of later stages.
Suppose stage 2 fails. I~will reexamine that part of the program for
mistakes and typographical errors (like~{\tt 1} or~{\tt l} in place of~{\tt I}).
If I~locate no error that way, I~next locate the first elimination that
gives rise to an incorrect equation, by putting a call on {\tt CHECK}
at the end of the {\tt I}-iteration, modified to print the values of~{\tt N}
and~{\tt I}, and to halt the program immediately. With the troublesome
values of~{\tt N} and~{\tt I} in hand, I~can modify the program to print
every value it computes when {\tt N} and {\tt I} have bad values; for
example inserting
\smallskip
{\obeylines\obeyspaces\let =\ \tt
IF N=BADN AND I=BADI THEN
WRITELN(I,J,A[I,J])
}
\smallskip\noindent
after the computation of {\tt A[I,J]}.
Overall strategy in locating errors entails one form of the ``divide and
conquer'' paradigm: when the error lies in a program section which has
several parts, make an experiment which establishes which part fails, and
proceed to test that part. An iterative command is, in effect, a~long
series of copies of a single command, and locating the first one that
fails is a minor variation of the same theme.
Practical considerations govern choices such as how large a part of a program
should be tested at once. If a program is so large that there is no
realistic hope of its being free of errors, it probably should be
tested in pieces. Insertion of printing commands
everywhere allows watching the
program, but may print volumes of instant waste paper, in which the
useful information may be hard to locate. More selective printing can be
very helpful, but requires more preparation. Testing a program on a
large example may be more likely to exercise every part several times,
but determining the correct answers may be hard. Common sense and
experience must be used.
The three-equation test of the Gaussian elimination algorithm is small
enough that a full trace (printing the effect of every assignment)
would be feasible. I~have used more selective tracing only to demonstrate
how to handle larger problems where a full trace is unreasonable,
by concentrating on a particular part of the program during a limited
time span.
\bigskip
\line{\copyright 1984 Robert W. Floyd\hfill}
\line{First draft (not published) February 12, l985.\hfill}
\bye